Toward Conditional Models of Identity Uncertainty with Application to Proper Noun Coreference
Abstract
Coreference analysis, also known as record linkage or identity uncertainty, is a difficult and important problem in natural language processing, databases, citation matching and many other tasks. This paper introduces several discriminative, conditional-probability models for coreference analysis, all examples of undirected graphical models. Unlike many historical approaches to coreference, the models presented here are relational: they do not assume that pairwise coreference decisions should be made independently of each other. Unlike other relational models of coreference that are generative, the conditional model here can incorporate a great variety of features of the input without having to be concerned about their dependencies, paralleling the advantages of conditional random fields over hidden Markov models. We present experiments on proper noun coreference in two text data sets, showing results in which we reduce error by nearly 28% or more over traditional thresholded record-linkage, and by up to 33% over an alternative coreference technique previously used in natural language processing.
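The contrast between independent pairwise decisions and a relational model can be illustrated with a toy sketch. The mentions, the scores, and the scoring rule below (within-cluster scores added, cross-cluster scores subtracted) are all assumptions invented for illustration; this is not the paper's model, training procedure, or data, and the exhaustive search is only feasible for a handful of mentions.

```python
from itertools import combinations

# Hypothetical pairwise compatibility scores (positive = likely coreferent).
# These numbers are invented for illustration only.
SCORES = {
    ("Mr. Powell", "Powell"): 2.0,
    ("Powell", "she"): 1.0,
    ("Mr. Powell", "she"): -4.0,
}

def score(a, b):
    """Look up the symmetric pairwise score for two mentions."""
    return SCORES.get((a, b), SCORES.get((b, a), 0.0))

def independent_decisions(mentions, threshold=0.0):
    """Thresholded pairwise linkage: each pair is decided on its own."""
    return {(a, b) for a, b in combinations(mentions, 2) if score(a, b) > threshold}

def partitions(items):
    """Enumerate all set partitions of a small list."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Put `first` into each existing cluster in turn...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ...or start a new cluster of its own.
        yield [[first]] + part

def partition_score(part):
    """Add pairwise scores within clusters, subtract scores across clusters."""
    cluster_of = {m: i for i, cluster in enumerate(part) for m in cluster}
    total = 0.0
    for a, b in combinations(cluster_of, 2):
        s = score(a, b)
        total += s if cluster_of[a] == cluster_of[b] else -s
    return total

def best_partition(mentions):
    """Pick the highest-scoring partition by brute force (toy sizes only)."""
    return max(partitions(list(mentions)), key=partition_score)

mentions = ["Mr. Powell", "Powell", "she"]
print(independent_decisions(mentions))
print(best_partition(mentions))
```

With these invented scores, the independent decisions link "Mr. Powell" to "Powell" and "Powell" to "she" while rejecting "Mr. Powell" and "she", which is not a consistent set of entities. Scoring whole partitions instead lets the strongly negative pair outweigh the weak positive one, so "she" ends up in its own cluster.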
Similar resources
Conditional Models of Identity Uncertainty with Application to Noun Coreference
Coreference analysis, also known as record linkage or identity uncertainty, is a difficult and important problem in natural language processing, databases, citation matching and many other tasks. This paper introduces several discriminative, conditional-probability models for coreference analysis, all examples of undirected graphical models. Unlike many historical approaches to coreference, the...
Object Consolidation by Graph Partitioning with a Conditionally-Trained Distance Metric
Coreference analysis, also known as record linkage, object consolidation or identity uncertainty, is a difficult and important problem in natural language processing, databases, citation matching and many other tasks. This paper introduces several discriminative, conditional-probability models for coreference analysis, all examples of undirected graphical models. Unlike many historical approach...
Revisiting the Effects of Growth Uncertainty on Inflation in Iran: An Application of GARCH-in-Mean Models
This paper investigates the relationship between inflation and growth uncertainty in Iran for the period 1988-2008 using quarterly data. We employ a Generalized Autoregressive Conditional Heteroscedasticity in Mean (GARCH-M) model to estimate the time-varying conditional residual variance of growth, a standard measure of growth uncertainty. The empirical evidence shows that growth uncertain...
Fuzzy Coreference Resolution for Summarization
We present a fuzzy-theory based approach to coreference resolution and its application to text summarization. Automatic determination of coreference between noun phrases is fraught with uncertainty. We show how fuzzy sets can be used to design a new coreference algorithm which captures this uncertainty in an explicit way and allows us to define varying degrees of coreference. The algorithm is e...
An Integrated, Conditional Model of Information Extraction and Coreference with Application to Citation Matching
Although information extraction and coreference resolution appear together in many applications, most current systems perform them as independent steps. This paper describes an approach to integrated inference for extraction and coreference based on conditionally-trained undirected graphical models. We discuss the advantages of conditional probability training, and of a coreference model struct...
Publication date: 2003